Model building with likelihood basis pursuit
Abstract
We consider a nonparametric penalized likelihood approach for model building called likelihood basis pursuit (LBP), which estimates the probabilities of binary outcomes given explanatory vectors while automatically selecting important features. The LBP model involves parameters that balance the competing goals of maximizing the log-likelihood and minimizing the penalized basis pursuit terms. These parameters are selected to minimize a proxy for misclassification error, namely the randomized generalized approximate cross validation (ranGACV) function. The ranGACV function is not easily represented in compact form; its function values can only be obtained by solving two instances of the LBP model, which may be computationally expensive. A grid search is typically used to find appropriate parameters, requiring the solution of hundreds or thousands of instances of the LBP model. Since only parameters (data) change between solves, the resulting problem is a nonlinear slice model in the parameter space. We show how slice-modeling techniques significantly improve the efficiency of individual solves and thus speed up the grid search. In addition, we consider replacing the grid search with derivative-free optimization algorithms for parameter selection. We show how, when seeded with a coarse grid search, these algorithms can find better solutions with fewer function evaluations. Our interest in this area comes directly from the seminal work that Olvi and his collaborators have carried out in designing and applying optimization techniques to problems in machine learning and data mining.
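The idea of seeding a derivative-free method with a coarse grid search can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_score` is a hypothetical, cheap stand-in for the expensive ranGACV evaluation, the two parameters and the grid values are invented, and a simple compass (pattern) search stands in for whichever derivative-free algorithms the paper actually uses.

```python
import itertools


def toy_score(params):
    """Hypothetical stand-in for an expensive tuning criterion such as ranGACV."""
    a, b = params
    return (a - 0.3) ** 2 + 2.0 * (b + 1.7) ** 2


def coarse_grid_search(score, axes):
    """Evaluate score at every grid point; return (best_point, best_value, n_evals)."""
    points = list(itertools.product(*axes))
    values = [score(p) for p in points]
    i = min(range(len(points)), key=values.__getitem__)
    return points[i], values[i], len(points)


def compass_search(score, start, start_value, step, tol=1e-4, max_evals=500):
    """Derivative-free compass search: probe +/- step along each axis,
    accept any improvement, and halve the step when a full sweep fails."""
    x, fx, evals = list(start), start_value, 0
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = score(trial)
                evals += 1
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break  # move on to the next coordinate
        if not improved:
            step *= 0.5
    return tuple(x), fx, evals


# Assumed coarse grid: 5 values per parameter, 25 evaluations in total.
axes = [[-4.0, -2.0, 0.0, 2.0, 4.0]] * 2
seed, seed_value, grid_evals = coarse_grid_search(toy_score, axes)

# Refine from the best grid point instead of evaluating a finer grid.
best, best_value, search_evals = compass_search(toy_score, seed, seed_value, step=1.0)
print(seed, best, grid_evals + search_evals)
```

In this toy setting the refinement reuses the best coarse-grid point as its starting point, so the total number of evaluations is the coarse grid plus the local search, rather than the size of a fine grid with comparable resolution.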
Related resources
Smoothing Spline ANOVA Models II. Variable Selection and Model Building via Likelihood Basis Pursuit
We describe Likelihood Basis Pursuit, a nonparametric method for variable selection and model building, based on merging ideas from Lasso and Basis Pursuit works and from smoothing spline ANOVA models. An application to nonparametric variable selection for risk factor modeling in the Wisconsin Epidemiological Study of Diabetic Retinopathy is described. Although there are many approaches to vari...
Variable Selection and Model Building via Likelihood Basis Pursuit
This paper presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis...
Optimization of Slice Models
We consider solving large-scale mathematical programming problems quickly and efficiently as part of a larger system. By considering how the programming problems are integrated into their systems, we show how the overall solution process can become more efficient. Specifically, we consider three systems: slice modeling, model building and treatment planning. Slice modeling describes a system co...
A Variety of Regularization Problems
Beginning with a review of some optimization problems in RKHS, and going on to a model selection problem via Likelihood Basis Pursuit (LBP).
Predicting the Likelihood of Falls among the Elderly Using Likelihood Basis Pursuit Technique
This study reports on the application of the knowledge discovery in database process to generate models that can predict the likelihood of falls among the elderly who reside in long-term care facilities. This process was applied to data held in the Minimum Data Set, a comprehensive resident assessment instrument being used in all Medicare and Medicaid supported nursing homes in the United State...
Journal: Optimization Methods and Software
Volume: 19
Publication date: 2004